{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reading Data from Web Sites\n",
"\n",
"Many times you want to add external data to your analysis. External data is often found in external web sites, usually in the format of an HTML table. You can import HTML directly into Excel using _\"Import\"_.\n",
"\n",
"In this notebook, we will learn how to use pandas _read_html_ to load tables from web site with ease, including cases where the data is complex to retrieve. "
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: lxml in /home/studio-lab-user/.conda/envs/default/lib/python3.9/site-packages (4.6.4)\n",
"Collecting requests\n",
" Using cached requests-2.26.0-py2.py3-none-any.whl (62 kB)\n",
"Collecting certifi>=2017.4.17\n",
" Using cached certifi-2021.10.8-py2.py3-none-any.whl (149 kB)\n",
"Collecting idna<4,>=2.5\n",
" Using cached idna-3.3-py3-none-any.whl (61 kB)\n",
"Collecting charset-normalizer~=2.0.0\n",
" Using cached charset_normalizer-2.0.8-py3-none-any.whl (39 kB)\n",
"Collecting urllib3<1.27,>=1.21.1\n",
" Using cached urllib3-1.26.7-py2.py3-none-any.whl (138 kB)\n",
"Installing collected packages: urllib3, idna, charset-normalizer, certifi, requests\n",
"Successfully installed certifi-2021.10.8 charset-normalizer-2.0.8 idna-3.3 requests-2.26.0 urllib3-1.26.7\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install lxml requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading HTML data\n",
"\n",
"We will load a few tables from simple sites as well as complex ones"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading from simple web sites \n",
"\n",
"If you have a simple web page with a few tables, you can pass the URL of the page to Pandas and call read_html. "
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"url = 'https://en.wikipedia.org/wiki/The_Championships,_Wimbledon'\n",
"dfs = (\n",
" pd\n",
" .read_html(url)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"read_html is returning a list of all the tables in the page. We can scroll through the content of the tables on that page, by changing the index in the list. We can print the second element (index 1 in the list), we will get the table with the prize amounts for the Wimbeldon Tennis competition."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
0
\n",
"
1
\n",
"
2
\n",
"
3
\n",
"
4
\n",
"
5
\n",
"
6
\n",
"
7
\n",
"
8
\n",
"
9
\n",
"
10
\n",
"
11
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
2021 Event
\n",
"
W
\n",
"
F
\n",
"
SF
\n",
"
QF
\n",
"
Round of 16
\n",
"
Round of 32
\n",
"
Round of 64
\n",
"
Round of 1281
\n",
"
Q3
\n",
"
Q2
\n",
"
Q1
\n",
"
\n",
"
\n",
"
1
\n",
"
Singles
\n",
"
£1,700,000
\n",
"
£900,000
\n",
"
£465,000
\n",
"
£300,000
\n",
"
£181,000
\n",
"
£115,000
\n",
"
£75,000
\n",
"
£48,000
\n",
"
£25,500
\n",
"
£15,500
\n",
"
£8,500
\n",
"
\n",
"
\n",
"
2
\n",
"
Doubles
\n",
"
£480,000
\n",
"
£240,000
\n",
"
£120,000
\n",
"
£60,000
\n",
"
£30,000
\n",
"
£19,000
\n",
"
£12,000
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" 0 1 2 3 4 5 \\\n",
"0 2021 Event W F SF QF Round of 16 \n",
"1 Singles £1,700,000 £900,000 £465,000 £300,000 £181,000 \n",
"2 Doubles £480,000 £240,000 £120,000 £60,000 £30,000 \n",
"\n",
" 6 7 8 9 10 11 \n",
"0 Round of 32 Round of 64 Round of 1281 Q3 Q2 Q1 \n",
"1 £115,000 £75,000 £48,000 £25,500 £15,500 £8,500 \n",
"2 £19,000 £12,000 NaN NaN NaN NaN "
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfs[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading from complex web sites\n",
"\n",
"Many times pages will be more complex and we can still extract the relevant data. "
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"url = 'https://ncov2019.live/data'\n",
"table_id = 'sortable_table_world'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we will look like a browser to the web site, as some web sites block their context to bots and crawlers. Since we are not hitting the web site a lot, pretending to be a browser is considered as acceptable usage. \n",
"\n",
"We will send to the web site a header that a browser is sending and get the reply of the page as text. The text (_r.text_) will be parsed by the _read_html_ function and create the dataframe. We will also add the HTML id of the table that we want. This id (_'sortable_table_world'_) can be found when using the _inspect_ option in Chrome, Safari, Firefox and other browsers. "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"header = {\n",
" \"User-Agent\": \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36\",\n",
" \"X-Requested-With\": \"XMLHttpRequest\"\n",
"}\n",
"\n",
"response = requests.get(url, headers=header)\n",
"\n",
"df_list = (\n",
" pd\n",
" .read_html(\n",
" response.text, \n",
" attrs={'id': table_id}\n",
" )\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Name
\n",
"
Confirmed
\n",
"
Per Million
\n",
"
Deceased
\n",
"
Per Million.1
\n",
"
Tests
\n",
"
Active
\n",
"
Recovered
\n",
"
Per Million.2
\n",
"
Vaccinated
\n",
"
Population
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
TOTAL
\n",
"
263,793,010 +72,704
\n",
"
33535.241
\n",
"
5,242,802 +963
\n",
"
666.502
\n",
"
4,313,944,340 +1,570,359
\n",
"
22039849
\n",
"
236,363,499 +82,533
\n",
"
30048.207
\n",
"
4291126845
\n",
"
7866143246
\n",
"
\n",
"
\n",
"
1
\n",
"
★ Afghanistan
\n",
"
157359
\n",
"
3917.630
\n",
"
7309
\n",
"
181.970
\n",
"
793150
\n",
"
9469
\n",
"
140581
\n",
"
3499.930
\n",
"
4285440
\n",
"
40166836
\n",
"
\n",
"
\n",
"
2
\n",
"
★ Albania
\n",
"
200173
\n",
"
69665.420
\n",
"
3101
\n",
"
1079.230
\n",
"
1395649
\n",
"
6694
\n",
"
190378
\n",
"
66256.510
\n",
"
1071871
\n",
"
2873348
\n",
"
\n",
"
\n",
"
3
\n",
"
★ Algeria
\n",
"
210723
\n",
"
4685.830
\n",
"
6076
\n",
"
135.110
\n",
"
230861
\n",
"
60045
\n",
"
144602
\n",
"
3215.500
\n",
"
6716299
\n",
"
44970242
\n",
"
\n",
"
\n",
"
4
\n",
"
★ Andorra
\n",
"
17426
\n",
"
0.000
\n",
"
131
\n",
"
0.000
\n",
"
193595
\n",
"
1266
\n",
"
16029
\n",
"
0.000
\n",
"
54999
\n",
"
77440
\n",
"
\n",
"
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
\n",
"
\n",
"
221
\n",
"
★ Montserrat
\n",
"
44
\n",
"
0.000
\n",
"
1
\n",
"
0.000
\n",
"
2920
\n",
"
Unknown
\n",
"
41
\n",
"
0.000
\n",
"
Unknown
\n",
"
4996
\n",
"
\n",
"
\n",
"
222
\n",
"
★ Western Sahara
\n",
"
10
\n",
"
0.000
\n",
"
1
\n",
"
0.000
\n",
"
Unknown
\n",
"
Unknown
\n",
"
8
\n",
"
0.000
\n",
"
Unknown
\n",
"
618081
\n",
"
\n",
"
\n",
"
223
\n",
"
★ Saint Helena
\n",
"
2
\n",
"
0.000
\n",
"
Unknown
\n",
"
0.000
\n",
"
Unknown
\n",
"
Unknown
\n",
"
2
\n",
"
0.000
\n",
"
4361
\n",
"
6103
\n",
"
\n",
"
\n",
"
224
\n",
"
★ Micronesia
\n",
"
1
\n",
"
0.000
\n",
"
Unknown
\n",
"
0.000
\n",
"
Unknown
\n",
"
Unknown
\n",
"
1
\n",
"
0.000
\n",
"
Unknown
\n",
"
116729
\n",
"
\n",
"
\n",
"
225
\n",
"
★ China
\n",
"
98,897 +73
\n",
"
68.710
\n",
"
4636
\n",
"
3.220
\n",
"
160000000
\n",
"
906
\n",
"
93,355 +35
\n",
"
64.860
\n",
"
1225000000
\n",
"
1439323776
\n",
"
\n",
" \n",
"
\n",
"
226 rows × 11 columns
\n",
"
"
],
"text/plain": [
" Name Confirmed Per Million Deceased \\\n",
"0 TOTAL 263,793,010 +72,704 33535.241 5,242,802 +963 \n",
"1 ★ Afghanistan 157359 3917.630 7309 \n",
"2 ★ Albania 200173 69665.420 3101 \n",
"3 ★ Algeria 210723 4685.830 6076 \n",
"4 ★ Andorra 17426 0.000 131 \n",
".. ... ... ... ... \n",
"221 ★ Montserrat 44 0.000 1 \n",
"222 ★ Western Sahara 10 0.000 1 \n",
"223 ★ Saint Helena 2 0.000 Unknown \n",
"224 ★ Micronesia 1 0.000 Unknown \n",
"225 ★ China 98,897 +73 68.710 4636 \n",
"\n",
" Per Million.1 Tests Active Recovered \\\n",
"0 666.502 4,313,944,340 +1,570,359 22039849 236,363,499 +82,533 \n",
"1 181.970 793150 9469 140581 \n",
"2 1079.230 1395649 6694 190378 \n",
"3 135.110 230861 60045 144602 \n",
"4 0.000 193595 1266 16029 \n",
".. ... ... ... ... \n",
"221 0.000 2920 Unknown 41 \n",
"222 0.000 Unknown Unknown 8 \n",
"223 0.000 Unknown Unknown 2 \n",
"224 0.000 Unknown Unknown 1 \n",
"225 3.220 160000000 906 93,355 +35 \n",
"\n",
" Per Million.2 Vaccinated Population \n",
"0 30048.207 4291126845 7866143246 \n",
"1 3499.930 4285440 40166836 \n",
"2 66256.510 1071871 2873348 \n",
"3 3215.500 6716299 44970242 \n",
"4 0.000 54999 77440 \n",
".. ... ... ... \n",
"221 0.000 Unknown 4996 \n",
"222 0.000 Unknown 618081 \n",
"223 0.000 4361 6103 \n",
"224 0.000 Unknown 116729 \n",
"225 64.860 1225000000 1439323776 \n",
"\n",
"[226 rows x 11 columns]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_list[0]"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"covid_19_status = df_list[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Simple Data Visualization\n",
"\n",
"External data is often \"messy\", and it is a best practice to clean it up before staring to work with it. In the next cell we will make sure that the numeric columns are indeed numeric (removing \"Unknown\", for example). We will also resolve the conflict of having multiple columns with the same name (\"Per Million\", in this exammple), and translate to more meaningful names.\n",
"* Start with the Covid-19 status table\n",
"* Convert the first 'Per Million' column from the table to be numeric\n",
"* Convert the second 'Per Million' column from the table to be numeric\n",
"* Fill missing values in the table with 0\n",
"* Sort the table in decreasing order by number of deceased cases\n",
"* Take the first top 20 country rows\n",
"* Plot the results\n",
"* as bar graph with the name of the country and the number of deceased cases per million people\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAGVCAYAAAASbiG+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAABNlElEQVR4nO2deZgU1fW/3w+IIIoIiEYBxbhLZFFcwZUENBqXRKPEPSYaY+Iaoyb5BTT6dYn7Eo1R3OOaGPe478YFBEFFIu4oAgIiRFGB8/vj3h56hp7pqu5ieoY67/P001236p46tfSpW+eee67MDMdxHCcftKm1Ao7jOE7z4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRzxGSTNJ6tdYjCyRdJ+mMWuvRmlmW7gcnOW70a4Sk9yR9KWmupM8kPS/pF5IyuSaSnpT0syplbCDpDkmfSpojabykEyS1zULHloICx0h6TdL/JE2Jx71p0TbbSno8Xq85ku6VtElct3+8nmogdzlJ0yXtLmlHSVOK1j0paX6U97mkMZJOkdS+jK5XSZokaZGkQxusay/pQkkfS5ot6S+S2mVyksogaQ1J10iaGo/pTUmnSVpxKe93pKSbluY+ljXc6NeWH5hZJ2Bt4GzgZOCa2qoUkLQu8CLwIbCpmXUG9gUGAp1qqdtS4GLgWOAYoCuwAfAvYDcASdsADwN3A2sC6wCvAs9J+nbcdhVghwZydwEM+Hcj+/1VvP5rACcC+wMPNHx4NOBV4JfAKyXWnUK4Pt+Jx7AZ8IcmZGWCpK7Af4AVgG3iMX2PcE7WXdr7b4r4QHc7V4yZ+acGH+A94LsNyrYEFgHficvtgfOAD4BpwJXACnFdF+A+YAYwO/7uGdedCSwE5gPzgMtiuQG/AN4CPgMuB9SIfjcB95c5hjuAT4A5wNNAn6J13wfeAOYCHwG/KVq3OzAu6vA80Ldo3clx+7nAJGBII/u+Lp6PR+K2TwFrx3WXA+c32P4e4PgSctaP52rLJo7zGeAvJcofBG6Iv68CRjVYfztwYfy9IzClaN2TwM8abL8W8AWwe4L751ng0AZlo4F9i5Z/AnzYhAwjPOjeAT4F/kxoCC4PzCI87AvbrhZ1615CzhnABKBNE/vaFng53isvA9s29l8ARgI3xd+9o56HEP4HnwK/j+t2Ab4GviHc568WndszgeeAL4GTgDEN9DkBuLs5//Mt5VNzBfL6aXijF5V/ABwVf18YjVVXQuv6XuCsuK4b8COgY1x3B/CvIjmljIoRHg6rRAMzA9ilEf0+AQ4rcww/jftuD1wEjCtaNxXYLv7uAmwWfw8ApgNbAW3jn/m9KGNDwpvFmnHb3sC6jez7OoKx3z7WvRh4Nq7bEvi4YISAVaPBWr2EnF8A7zdxjB0JD4WdSqw7DJgafw8CPmfxQ7lzNDj94/KOlDH6sfxp4JwE909jRv/HRcsHxGveuREZBjwR76+1gP8WdAL+UqwH4U3o3kbkvACc1oSuXQkNk4OA5YDhcblbqf8CpY3+3whvEv2Ar4CNG27b4Nx+APSJ+2tPeIhtXLTNWOBHzfmfbykff+1peXwMdI2v+EcQWqezzGwu8H8EFwBmNtPM/mFmX8R1Z7Kke6EUZ5vZZ2b2AeEP37+R7boRDHejmNkoM5trZl8R/nz9JHWOq78BNpG0spnNNrOCO+II4K9m9qKZLTSz6wl/4q0JxrV9rNfOzN4zs7ebUOF+M3s67v/3wDaSepnZS4QW5ZC43f7Ak2Y2rYLj7Epo/ZbaZirhgYKZPUd4G9s7rvsx8F8zG9eE7FJ8HPdZCf8GjpXUXdK3CK14CA+uxjgn3l8fEB7cw2P59cDwIlfTQcCNjcgodw53A94ysxvNbIGZ3QK8Cfyg7BEt5jQz+9LMXiW4uPqV2f46M3s97u8r4DbgQABJfQgPk/tS7H+ZwY1+y6MHoVXSnfBnHRM7ej8j/Km7A0jqKOmvkt6X9DmhhbhKgk7WT4p+fwGs1Mh2Mwm+5pJIaivpbElvx/2/F1etGr9/RHDxvC/pqegXh9B/cWLhmOJx9SK07icDxxEeINMl3SppzSaO5cPCDzObRzhvhe2vJ/7J43djBqvJ4yS0SBc1ss0aBHdDgRuAg+Pvg+JyWgrXH0nzij5rJah7JqEFO47gNvsX4eFb6mFX4MOi3+8Tz5+ZvUi4P3aUtBGwHuGtsxTlzuGaUXYx7xOONSlJ79sCHzZYvh74SXyIHQTcHh8GucONfgtC0haEP8KzBGPyJcFPvkr8dDazws1+IsEdspWZrUxwcwAUWmbVpk99lGC4G+MnwJ7AdwmujN7F+zezl81sT4Iv+F8E/zaEP+OZRce0ipl1jK0/zOzvZjaY8HAw4JwmdOhV+CFpJUIL+eNYdBOwp6R+wMZRh1I8BvSUNLDUSjP7H6GTct8Sq38c6xe4ERgSH3BbAzc3ofsSSOoFbE7oQ8DMVir6fFCufmwJ/8rMepjZtwnGeIyZLWqiWq+i32ux+PzB4gfnQcCdZja/ERmPAns30WH6MeF6FrMWoe8G4H/Ufxv5VhP6NqSx+7xeuZm9QPD/b0e4dxtrBCzzuNFvAUhaWdLuwK0E/+SE+Ef9G3ChpNXidj0kDYvVOhEeCp/F6IkRDcROA75dhVojgG0l/Tm6CpC0nqSbJK0S9/8VwbB0JLieCsezvKQDJHU2s28Ivu6C4fkb8AtJW8XIihUl7Sapk6QNJe0cwxbnx+NrymB9X9JgScsDfwJeMLMPAcxsCqHD8EbgH2b2ZSkBZvYWwX99SwyrXF5ShxiGeUrc7BTgEIWwzk6SuiiMEdgGOK1I1nuEB/YtwCNm9gkJiG9tOxCig14CHmhi2+UldSA8XNtFXdvEdT0krRnP69bA/2PJ+6IhJ8Xj6UXw299WtO4mgrvqQJp+a7kAWBm4XtLaRbpcIKlvPJ4NJP1EIYx1P2ATFrtXxgH7S2oXH777lNG5mGlA74QROjcAlwHfmNmzKfaxbFHrToW8fgjukC8JnZFzCK3Jo4G2Rdt0IBjTdwiGcyJwTFy3JqHDah6hA+5IQutmubh+m1g+G7gklhmwXpH864AzmtBxQ0IH8cyo46sE90tbwuv13VH/9wluDSO4AZYnuKJmR71fBgYXyd0lln1G8AXfQXiI9CUYvbkEF8d9xE7dErpdx+LonXkE99Y6DbY5MOq0RCdsg+1EMHivE1wHHxGMX3E00uCi8/05cD8xyqqBrEPjPvdrUL4jS3bkzo/HOpfglvk90KGMrk9G+cWfHeO67eN99QUh8umAMrKKo3dmAudTdP/FbR6NMktGeRVttyYwiuCGmUvw2Y8AOhadvzHxPhrT4H74NiE8eF48r5ewZEfucg3OQaHDuRvhQTsbeKXh+gY6rkVoRDTa6ZyHj+LJcJxlDknbE1qra5vf6BUhaRTwsZkt9Xj/pY2kFQiRY5tZeMPLJcvVWgHHWRoojEQ9FrjaDX5lSOoN/JAQZrsscBTwcp4NPrjRd5ZBJG1MiFl/lRBL76RE0p+A4wnjQt6ttT7VIuk9ghtvr9pqUnvcveM4jpMjPHrHcRwnR7jRdxzHyREt2qe/6qqrWu/evWuthuM4TqtizJgxn5pZ91LrWrTR7927N6NHj661Go7jOK0KSQ3TXtTh7h3HcZwc4UbfcRwnR7jRdxzHyREt2qfvOK2db775hilTpjB/fmMJKh2ncjp06EDPnj1p1y75VMiJjH7Mqng1Ye5NI8yYNImQlKo3ISHTj81sdsxXfTEhl/oXhNl9XolyDmHxnJ1nWJhAw3GWWaZMmUKnTp3o3bs3anLqW8dJh5kxc+ZMpkyZwjrrrJO4XlL3zsXAv81sI8KMNRMJ6WYfM7P1CTnFC2lodyXMO7o+YZakK6Bu8uQRhGnytgRGSOqSWFPHaYXMnz+fbt26ucF3MkcS3bp1S/0WWdbox+nvtgeuATCzr83sM8IEGoWW+vUszmmxJ2GyaLMwccEqktYAhhFyjM8ys9mElLi7pNLWcVohbvCdpUUl91aSlv46hAm0r5U0VtLVklYkTDJdmBfzE2D1+LsH9acqmxLLGit3HMdxmokkPv3lgM2AX5vZi5IuZrErBwAzM0mZZG6TdATBLcRaay05LWjvU+5vsv57Z++WhRqOs1Qod/+mJe39PnLkSFZaaSV+85vfZKpHlhQGZa666qrlN3ZSk6SlP4Uw48+LcflOwkNgWnTbEL+nx/UfUX/ezZ6xrLHyepjZVWY20MwGdu9echSx4zhOq2XBggU13X9Zo29hns8PJW0Yi4YAbwD3AIfEskMIU+cRyw8umqdzTnQDPQQMjfNxdgGGxjLHcZYiZ555JhtssAGDBw9m0qRJALz99tvssssubL755my33Xa8+eabAEybNo29996bfv360a9fP55//nkAbrrpJrbcckv69+/PkUceycKFCwE46qijGDhwIH369GHEiMXT8Z5yyilssskm9O3bt+6tYsaMGfzoRz9iiy22YIsttuC5554DYObMmQwdOpQ+ffrws5/9jHLp3m+44Qb69u1Lv379OOiggwC499572WqrrRgwYADf/e53mTZtGgBPPfUU/fv3p3///gwYMIC5c+cC8Oc//5ktttiCvn371un9v//9j912241+/frxne98h9tuu620AoS3kd/+9rdsuummbLnllkyePLnJYxw5ciQHHXQQgwYNqtO5Iddddx177bUX3/ve9+jduzeXXXYZF1xwAQMGDGDrrbdm1qxZTZ6XpCSN0/81cHOcgPodwsQUbYDbJR1OmCP1x3HbBwjhmpMJIZuHAZjZrDgxw8txu9PNLJujcBynJGPGjOHWW29l3LhxLFiwgM0224zNN9+cI444giuvvJL111+fF198kV/+8pc8/vjjHHPMMeywww7cddddLFy4kHnz5jFx4kRuu+02nnvuOdq1a8cvf/lLbr75Zg4++GDOPPNMunbtysKFCxkyZAjjx4+nR48e3HXXXbz55ptI4rPPPgPg2GOP5fjjj2fw4MF88MEHDBs2jIkTJ3LaaacxePBg/vjHP3L//fdzzTXXNHo8r7/+OmeccQbPP/88q666ap0hHDx4MC+88AKSuPrqqzn33HM5//zzOe+887j88ssZNGgQ8+bNo0OHDjz88MO89dZbvPTSS5gZe+yxB08//TQzZsxgzTXX5P77gwtuzpw5TZ7bzp07M2HCBG644QaOO+447rvvvkaPEeCNN97g2WefZYUVVmhU5muvvcbYsWOZP38+6623Hueccw5jx47l+OOPr9tPtSQy+mY2DhhYYtWQEtsaYYLvUnJGESZPdhynGXjmmWfYe++96dixIwB77LEH8+fP5/nnn2ffffet2+6rr74C4PHHH+eGG24AoG3btnTu3Jkbb7yRMWPGsMUWWwDw5ZdfstpqqwFw++23c9VVV7FgwQKmTp3KG2+8wSabbEKHDh04/PDD2X333dl9990BePTRR3njjTfq9vn5558zb948nn76af75z38CsNtuu9GlS+OR3I8//jj77rtvnb+/a9euQBgPsd9++zF16lS+/vrrurj1QYMGccIJJ3DAAQfwwx/+kJ49e/Lwww/z8MMPM2BAmAVy3rx5vPXWW2y33XaceOKJnHzyyey+++5st912TZ7b4cOH130ff/zxTR5j4dw3ZfABdtppJzp16kSnTp3o3LkzP/jBDwDYdNNNGT9+fJN1k+Ijch0nZyxatIhVVlmFcePGJdrezDjkkEM466yz6pW/++67nHfeebz88st06dKFQw89lPnz57Pccsvx0ksv8dhjj3HnnXdy2WWX8fjjj7No0SJeeOEFOnTokPkx/frXv+aEE05gjz324Mknn2TkyJFAcDPttttuPPDAAwwaNIiHHnoIM+PUU0/lyCOPXELOK6+8wgMPPMAf/vAHhgwZwh//+MdG91kcLln43dQxrrjiimWPo3379nW/27RpU7fcpk2bzPoCPPeO4yzDbL/99vzrX//iyy+/ZO7cudx777107NiRddZZhzvuuAMIRv3VV18FYMiQIVxxxRUALFy4kDlz5jBkyBDuvPNOpk8PsRqzZs3i/fff5/PPP2fFFVekc+fOTJs2jQcffBAILec5c+bw/e9/nwsvvLBO9tChQ7n00kvrdCs8dLbffnv+/ve/A/Dggw8ye/bsRo9n55135o477mDmzJl1ukBwxfToESLAr79+8UD/t99+m0033ZSTTz6ZLbbYgjfffJNhw4YxatSouhb4Rx99xPTp0/n444/p2LEjBx54ICeddBKvvPJKk+e24PO/7bbb2GabbZo8xpaEt/Qdpxlp7pDizTbbjP32249+/fqx2mqr1blobr75Zo466ijOOOMMvvnmG/bff3/69evHxRdfzBFHHME111xD27ZtueKKK9hmm20444wzGDp0KIsWLaJdu3ZcfvnlbL311gwYMICNNtqIXr16MWjQIADmzp3Lnnvuyfz58zEzLrjgAgAuueQSjj76aPr27cuCBQvYfvvtufLKKxkxYgTDhw+nT58+bLvttiVDtQv06dOH3//+9+ywww60bduWAQMGcN111zFy5Ej23XdfunTpws4778y774a53C+66CKeeOIJ2rRpQ58+fdh1111p3749EydOrDPUK620EjfddBOTJ0/mpJNOok2bNrRr167u4dcYs2fPpm/fvrRv355bbrmlyWNsSbToidEHDhxoDSdR8Th9pzUxceJENt5441qr4WRMSxpLUOoekzTGzEr1w7p7x3EcJ0+4e8dxnBbHzJkzGTJkieBAHnvsMbp169Zseuy99951rqIC55xzDu+9917FMh966CFOPvnkemXrrLMOd911V8Uy0+BG33GcFke3bt1aRCfo0jDEw4YNY9iwYZnLTYq7dxxnKdOS+82c1k0l95YbfcdZinTo0IGZM2e64XcypzCJStpxD+7ecZylSM+ePZkyZQozZsyotSrOMkhhusQ0uNF3nKVIu3btUk1l5zhLG3fvOI7j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OSGT0Jb0naYKkcZJGx7Kukh6R9Fb87hLLJekSSZMljZe0WZGcQ+L2b0k6ZOkckuM4jtMYaVr6O5lZfzMbGJdPAR4zs/WBx+IywK7A+vFzBHAFhIcEMALYCtgSGFF4UDiO4zjNQzXunT2B6+Pv64G9ispvsMALwCqS1gCGAY+Y2Swzmw08AuxSxf4dx3GclCQ1+gY8LGmMpCNi2epmNjX+/gRYPf7uAXxYVHdKLGusvB6SjpA0WtJon1fUcRwnW5LOkTvYzD6StBrwiKQ3i1eamUmyLBQys6uAqwAGDhyYiUzHcRwnkKilb2Yfxe/pwF0En/y06LYhfk+Pm38E9Cqq3jOWNVbuOI7jNBNljb6kFSV1KvwGhgKvAfcAhQicQ4C74+97gINjFM/WwJzoBnoIGCqpS+zAHRrLHMdxnGYiiXtndeAuSYXt/25m/5b0MnC7pMOB94Efx+0fAL4PTAa+AA4DMLNZkv4EvBy3O93MZmV2JI7jOE5Zyhp9M3sH6FeifCYwpES5AUc3ImsUMCq9mo7jOE4W+Ihcx3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyxHK1VqC56X3K/WW3ee/s3ZpBE8dxnOYncUtfUltJYyXdF5fXkfSipMmSbpO0fCxvH5cnx/W9i2ScGssnSRqW+dE4juM4TZLGvXMsMLFo+RzgQjNbD5gNHB7LDwdmx/IL43ZI2gTYH+gD7AL8RVLb6tR3HMdx0pDI6EvqCewGXB2XBewM3Bk3uR7YK/7eMy4T1w+J2+8J3GpmX5nZu8BkYMsMjsFxHMdJSNKW/kXAb4FFcbkb8JmZLYjLU4Ae8XcP4EOAuH5O3L6uvEQdx3Ecpxkoa/Ql7Q5MN7MxzaAPko6QNFrS6BkzZjTHLh3HcXJDkpb+IGAPSe8BtxLcOhcDq0gqRP/0BD6Kvz8CegHE9Z2BmcXlJerUYWZXmdlAMxvYvXv31AfkOI7jNE5Zo29mp5pZTzPrTeiIfdzMDgCeAPaJmx0C3B1/3xOXiesfNzOL5fvH6J51gPWBlzI7EsdxHKcs1cTpnwzcKukMYCxwTSy/BrhR0mRgFuFBgZm9Lul24A1gAXC0mS2sYv+O4zhOSlIZfTN7Engy/n6HEtE3ZjYf2LeR+mcCZ6ZVsqVRboCXD+5yHKel4mkYHMdxcoQbfcdxnBzhRt9xHCdHuNF3HMfJEW70HcdxcoQbfcdxnBzhRt9xHCdH5G4SlZaCx/o7jlML3Oi3UnwGMMdxKsHdO47jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODnCjb7jOE6OcKPvOI6TI9zoO47j5Ag3+o7jODmi7MxZkjoATwPt4/Z3mtkISesAtwLdgDHAQWb2taT2wA3A5sBMYD8zey/KOhU4HFgIHGNmD2V/SE5SfPYtx8kfSVr6XwE7m1k/oD+wi6StgXOAC81sPWA2wZgTv2fH8gvjdkjaBNgf6APsAvxFUtsMj8VxHMcpQ9mWvpkZMC8utosfA3YGfhLLrwdGAlcAe8bfAHcCl0lSLL/VzL4C3pU0GdgS+E8WB+LUBp/g3XFaF4l8+pLaShoHTAceAd4GPjOzBXGTKUCP+LsH8CFAXD+H4AKqKy9Rp3hfR0gaLWn0jBkzUh+Q4ziO0zhlW/oAZrYQ6C9pFeAuYKOlpZCZXQVcBTBw4EBbWvtxWg7+tuA4zUeq6B0z+wx4AtgGWEVS4aHRE/go/v4I6AUQ13cmdOjWlZeo4ziO4zQDZY2+pO6xhY+kFYDvARMJxn+fuNkhwN3x9z1xmbj+8dgvcA+wv6T2MfJnfeCljI7DcRzHSUAS984awPUx0qYNcLuZ3SfpDeBWSWcAY4Fr4vbXADfGjtpZhIgdzOx1SbcDbwALgKOj28hxHMdpJpJE74wHBpQof4cQfdOwfD6wbyOyzgTOTK+m4zSOjzdwnOT4iFzHcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwckWhEruMs63gEkJMX3Og7TkZ4OgmnNeDuHcdxnBzhRt9xHCdHuNF3HMfJEe7Td5wWRLX9At4h7ZTDjb7jOPXI4sHhndotFzf6juO0SPzBsXRwn77jOE6OcKPvOI6TI9y94zjOMol3apfGW/qO4zg5wo2+4zhOjnD3juM4TiMsixFE3tJ3HMfJEd7SdxzHWYpk8baQ5RuHt/Qdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRZY2+pF6SnpD0hqTXJR0by7tKekTSW/G7SyyXpEskTZY0XtJmRbIOidu/JemQpXdYjuM4TimStPQXACea2SbA1sDRkjYBTgEeM7P1gcfiMsCuwPrxcwRwBYSHBDAC2ArYEhhReFA4juM4zUNZo29mU83slfh7LjAR6AHsCVwfN7se2Cv+3hO4wQIvAKtIWgMYBjxiZrPMbDbwCLBLlgfjOI7jNE0qn76k3sAA4EVgdTObGld9Aqwef/cAPiyqNiWWNVbuOI7jNBOJjb6klYB/AMeZ2efF68zMAMtCIUlHSBotafSMGTOyEOk4juNEEhl9Se0IBv9mM/tnLJ4W3TbE7+mx/COgV1H1nrGssfJ6mNlVZjbQzAZ27949zbE4juM4ZUgSvSPgGmCimV1QtOoeoBCBcwhwd1H5wTGKZ2tgTnQDPQQMldQlduAOjWWO4zhOM5Ek984g4CBggqRxsex3wNnA7ZIOB94HfhzXPQB8H5gMfAEcBmBmsyT9CXg5bne6mc3K4iAcx3GcZJQ1+mb2LKBGVg8psb0BRzciaxQwKo2CjuM4Tnb4iFzHcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR7jRdxzHyRFu9B3HcXKEG33HcZwc4UbfcRwnR5Q1+pJGSZou6bWisq6SHpH0VvzuEssl6RJJkyWNl7RZUZ1D4vZvSTpk6RyO4ziO0xRJWvrXAbs0KDsFeMzM1gcei8sAuwLrx88RwBUQHhLACGArYEtgROFB4TiO4zQfZY2+mT0NzGpQvCdwffx9PbBXUfkNFngBWEXSGsAw4BEzm2Vms4FHWPJB4jiO4yxlKvXpr25mU+PvT4DV4+8ewIdF202JZY2VO47jOM1I1R25ZmaAZaALAJKOkDRa0ugZM2ZkJdZxHMehcqM/LbptiN/TY/lHQK+i7XrGssbKl8DMrjKzgWY2sHv37hWq5ziO45SiUqN/D1CIwDkEuLuo/OAYxbM1MCe6gR4ChkrqEjtwh8Yyx3EcpxlZrtwGkm4BdgRWlTSFEIVzNnC7pMOB94Efx80fAL4PTAa+AA4DMLNZkv4EvBy3O93MGnYOO47jOEuZskbfzIY3smpIiW0NOLoROaOAUam0cxzHcTLFR+Q6juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3Og7juPkCDf6juM4OaLZjb6kXSRNkjRZ0inNvX/HcZw806xGX1Jb4HJgV2ATYLikTZpTB8dxnDzT3C39LYHJZvaOmX0N3Ars2cw6OI7j5BaZWfPtTNoH2MXMfhaXDwK2MrNfFW1zBHBEXNwQmFRG7KrAp1WoVW39ZUlGS9ChpchoCTq0FBktQYeWIqMl6JBExtpm1r3UiuWq3HHmmNlVwFVJt5c02swGVrq/ausvSzJagg4tRUZL0KGlyGgJOrQUGS1Bh2plNLd75yOgV9Fyz1jmOI7jNAPNbfRfBtaXtI6k5YH9gXuaWQfHcZzc0qzuHTNbIOlXwENAW2CUmb1epdjErqClVH9ZktESdGgpMlqCDi1FRkvQoaXIaAk6VCWjWTtyHcdxnNriI3Idx3FyhBt9x3GcHOFG33EcJ0e0uDj9cki6FliiI8LMfppSzmpAh6L6H6So2wE4HOjTQEZaHXYrIeP0FPXPp8rOcEnfIaTEKNbhhhT1qz4XktYHziqhx7cT1P1hU+vN7J9J9YjyKr4vYv0uwPoNZDydUkZV16RaGdVcjyxlZIGkFYEvzWyRpA2AjYAHzeybhPVXAQ4GelNkL83smJR6VHVfSPo1cJOZzU6z31K0OqMP3Ff0uwOwN/Bx0sqS9gDOB9YEpgNrAxMJRispNwJvAsOA04EDoozESLoS6AjsBFwN7AO8lEZG3OdVkpYDrgVuMbM5KXQYAexI+GM+QMiJ9CyQxsBUfS4Iuo8ALiScj8NI/hb6gybWGZDI6GdxX0j6GXAsYfzJOGBr4D/AzilkVH1NMpBRzfXITIakrYFLgY2B5QkRf/8zs5VTiHka2C4a3YcJYeP7Ee7TJDwAvABMABal2G8dWdwXwOrAy5JeAUYBD1mlUThm1qo/hBvp+RTbvwp0A8bG5Z2Aa1Lus1B3fPxuB7yQUsb4Bt8rAc9UeA42BM4G3gf+DuyUsN6EeP5ejcurA4/U4FyMKejTsKwZ76Ms7osJhIbIuLi8EfDPCmRUe02qkpHF9chIxmhgPWAsweAfBpyVUsYr8fvXwG/j73Fp61d5b1V9X8R6IjSubgUmA/8HrJtWTmts6TdkfWC1FNt/Y2YzJbWR1MbMnpB0Ucp9Fl4NP4uv0Z+k1AHgy/j9haQ1gZnAGillFDKXbhQ/nxKM1wmSjjSz/cvpYOG1d4GklQkt3F5l6jQki3PxlaQ2wFtxHMdHhIdgWSQdaGY3STqh1HozuyChDlncF/PNbL4kJLU3szclbZhSRhbXpFoZFV+PjGVgZpMltTWzhcC1ksYCp6YQIUnbEFr2h8eytinq3yjp5wQPw1dFes1KISOL+wIzM0mfEP5jC4AuwJ2SHjGz3yaV0+qMvqS51PfpfwKcnELEZ5JWIrz23SxpOvC/lGpcFV8X/x9hRPFKwB9Tyrgv+gv/DLxCOKar0wiQdCGwO/A48H9mVnAPnSOpXKI6gNFRh78BY4B5hNfONGRxLo4luLqOAf5EeO09JGHdFeN3p5T7bEgW98WUeD7/BTwiaTbh7SsNWVyTamVUcz2ylPFFHLk/TtK5wFTSu5mOIzwk7jKz1yV9G3giRf2vCf/R37PY7hiQpm+i6vtC0rGEvoVPCXbiJDP7pvBgBRIb/VY1OEuSgF6WsnOtgYwVgfmEV6UDgM7AzWY2MxstK9KpPdDB0vnjBfwBuMDMljBOkjqnlNcbWNnMxiet05KQ1MvMPmxQ9i0z+yRh/UzvC0k7RBn/tpBGvBIZvanymrTm6yppbcJbSjvgeML5/IuZTa5AVkcz+6KCeu8AW5pZtVkxC/Iqui8knUYI2ljiYSFpYzNL3I/Wqow+gKQJZrZpjXVoD/yIJXv0y0beSNrZzB5vLOrEUkSbVHouJG0UXzE3a0SHVxLIqNqtIukiMztO0r2Ujsjao5yMIlnfAHcChxf+3JJeMbOSx5glklY2s88ldS21PokrIItr0kBeD0JndPH92WS0SBbXI8trmgXRtXMNsJKZrSWpH3Ckmf0yYf2Hgb0qeWAUybgEuNXMnq9URpGsqqLLoBW6d4BXJG1hZi+nqSTpWTMbXMI9JIK7LE1EwN3AHMKr81dltm3IDgR3TKmok8TRJpGKzgVwAmHOgvMb0SFJVEEWbpUb4/d5Vcgo8BrwDPCspH3N7G3CtW2SjO6LvxPcbGOiDDX4TuIKyOKaACDpHEKEyhvAwiIZ5UIEs7geVcuQdLuZ/VjSBEo/OPqmEHcRofPznlj3VUnbp6j/P4J76Qnq+/TThGyOAf4Q/fh3ER4Ao1PUR9IPgAuoLuowyGqFLf03CT367xMuSOHPmeZGqFaH18zsO821vyb0qPm5aCkUWvWSBhF82ScDpzVHS7+MXrJm/pPF/py+Zpa2QdIikLSGmU2N7p0lKOXiaELWi2a2laSxZjYglr1qZv0S1i/ZD2Fm1yfVoUhWV4KHYH9gLTNbP0XdVwkP/kfNbICknYADzezwMlWXoDW29IdVWlEh0uV1M9uoSh2el7SpmU2oQpeKXURFVHwuivTYtoQOaWLCLylRPAcYbWZ3J5QxCBjJYndE4eGVprNMhErPSRoC3E6IaEpWWbrRzA4qV1ZGxulm9sei5TaElm/SmPBCvaquCfAOwQ9ekdHP4npUI8PMpsbv9yV9izDNqgEvJ+2jKeLDeD5NUjtCB3Ni/3clxr0J1iPck4VWehqyiC4DWqHRjzfCYGB9M7tWUncShoKZ2UJJkyStVU1nMDAYOFTSu4Q/ViUt7GpcRMDiFk9DP19SJN0IrEsYMFLsBkhjYDoQbuQ74vKPgHeBfpJ2MrPjEsi4htBRN6ZIj7R8v/AjthJ3ArZNUb/ea7LCgLfNU+rQS9KpZnZWfKjfTogxT0xG1+QLgkviMSpzSWRxPaqWoTCo6Y8Ed6iAS+ODdVQKMb8ALgZ6EMJGHwYS+fOjDu9S2sWU5gF4LmEQ6dvAbcCfzOyzpPUjWUSXBX1aoXtnBDAQ2NDMNlCIcb/DzAYlrP80MIAw+rXupKXsNMzitbNqF5EaGUVqZon8fJImAptU436Q9AIwyEIcdcFYPkN4ME4ws00SyHjRzLaqVIciOanTWkg6FfgdsALBWEIwMF8DV5lZ4phwSQJuJgzG2Ql4wMwuSnEIWV2TqlwSWVyPjGRMAra1GEElqRthIGbiGHdJg8zsuXJlTdTvVrTYAdgX6Fr8RpdAxpHAP6qJAFKG0WWt0eiPIxjtV4p8dOOTtrIVQqaWwMyeSlC36iiNIllXAZdW6SKqys8n6Q7gmMLrdIU6TCKEtM2Jy52Bl8xsw2I/ahkZZxMGzPyT+i3TxBEraiStRYpzcVYaA9+gbnG/QTvgr8BzhNZu2uOo+ppUS0bXIwsZzwM7WgxtVIjZf9LMEr/BqUQEV6myNEgaY2ap3gJjA63QgfyUmd1b6f6rpdW5d4CvzcwkhXHJ4QmYmCTGvQlKRWnUiSbdgI0sXEQV+fm0OJyuE/CGpJeo/8dME1Z3LsGV8GQ8hu2B/4vX5dGEMgotwuKJnlNFrBBahH1jA+A0hWR0DyatbGanqvKkWA0jbmYT8t6cT/rjWJUKr4myi3rJ4npULEOLw4AnAy9KujvW3RNINN5AIVRzW6C76ocVr0yKEbkNHuhtCMeTym5KOovQL3FzLDpG0jZm9rsUMoqjy5YnNC7S5iECWqfRv13SX4FVFIZH/5QQrZEIVZHEycx2j9/rVKJ4A3bNQEalfr4sQiQLHZUTCX+uLWPx78yskADvpCRyzGynDNSZH78rSmuhKpJiZaR/gZFV1D02fu9ejQJZHE+VMgphwG/HT4FEgQGR5Ql9fctRP6z4c8JbYFKKH+gLgPeAH6eoD7Ab0N/MFgFIup7Q15PY6JtZ3TFEN+KehHs0Na3OvQMg6XvAUELL8iEzeyRF3dGEkKk7CE/tg4ENUvpu7wFuAe626gZt9AO2i4vPmNmrKetX5eeTtA4w1czmx+UVgNXN7L0UOiRy4SSQU22a6f9HeJgPAS4ntIr+ltT3GlvHWxCSxfWXtBEhtUWTqZtLyKnqOKKMtQmBCo9K6gi0NbO5KeqfY2YnlysrI6Pa67E6ISHYmma2q6RNgG3M7JqkMrJA0tpp+tqWkg7jCW6qWXG5K8FNVVVodaX/vVbT0lfIz30eIbJhAvAbM/uoEllWfRKn8wmDX86W9DIh6919BeOZBIVcGj9n8WCsmyRdZWaXJpVh9dMvVBJadgf1I1wWxrItUsh4TNKPCFkDK2pBNOaPT1G/DfBYjIj4h6T7SJnWggySYlV7HFHGzwmDtLoS7vUewJWEh1lSvseS+ah2LVHWmA5ZpP2+jpBe+fdx+b+EyJXERl/SwFi/4cjiNMbyC0l/ZskHWJNvcMoukR+EeQXGKgzwKrhAT0lRH9UfwV9wMyW2N8W0GqNPyCF9A8GV8QNCqy5VKyxSdRKn2C/wlELc/84E4z2K4C9MyuHAVgXDrTCK8j+E40qElhxFCjFGHjjRzN4pI2I5K8r/YWZfx3OThiMJo0kXSirchJbS11itP36RpMsJHfxYGJSUNgw2i2RpVR1H5GiCq+xFADN7SyEktyySjiKEI347ti4LdCJ0LCcli+NY1cxuV4iOwswWSEobunkzwUVYcS77KOM2gsvrF4SkbzMS1MsqkR9mdkvs8yo0pk629OMNikfwF9xMe1aiT2sy+p3MrOC7n6QwmUAlHETw4/+KEEfcixBbnoroCvkBocW/Gelb2qJ+/PJCKJ82oAEXAVMIHcwiuK3WJWTtHEWYSKMpZkjaw8zuAZC0JyGLX2KKfY1VkEWa6areOMxs7/hzZGyRdQb+nVJMFsfxVXz4AnUhsEmP5+8E43wW9VuSc9NElpHNcfxPIdyxEHCxNaFBkoYZhXuzCrqZ2TWSji1qrJVNW2Jmf43fp1W6Yy2ZR2lK/F5T0pppIpnM7LBK9WhIazL6HSQNYLFhXKF4OekJLPLvfQlUdEEl3U5ojf0buIwQgpW2JXItITLhrri8FylefSN7WP3h5FdJGmdmJ0tK0kn0C0IHcMEHPoXQx5EK1Q9He9LM7mtq+xJUnWaaxW8cC+IbR+qcSiX6WNJmxyx1HImDDCJPxWu3Quy7+iWQKLwvurPmAMPjW+jqhP/4SpJWsuQDErO4HicS8t2sK+k5oDvpOlABRki6Gmg4yCxNfqrCfA9TYz/FxwTXWSJU3YjzQidwB4I75lXCfdmX8Da+TTPpUV9Wa+nIja2vxrByProiOaXC2QoukTOSdIJKGkaIja90tGJBzmaE0E0IRibt6M3/EKajuzMW7QOcYGZbR+PfP6GclQDMbF6a/ce6ZxNeWwvhaMMJN2KlMe+p00xnQYk+lr0Jg7MSu9sayKvoOGL/xOEUBSoAV6d5e1GYtGQkMI3FbhGrpOOwmusR31I2JBzHJEs4L21R/ZsIo71fp/5x/DSFjN0JgwV7EVynKwMjLWGcvMJ4mlIjzrsB71iCEeeS/gmMsDgmR2GyoZFmlvghmIUedbJai9HPiujHX0h4FYbgEulImIxlsJk1NedqQUY74CiKBlsAVya5qdXIwK4CaV7DFSaEuJjQYjDCXJ7HE4abb25mz5apX3WERfQdF4ejtSVMOZjYwMQ6u7FkvplEnWXRuCw0M5PUixAjPtnMxqU8jm2K+lhWBP6T8jjGEzr1b7OQ5bMmSJpM6C+qdC6Akm97li4nU9XnQtIkSzH6thEZ1Y7IzWLE+evWYJR8qbKlrUeB1uTeyYrvWv3ReBO0OEPjgQllXEEYHPGXuHxQLPtZgrrFA7vWIgzkEbAK8AGQeAxA7Kht7CHVpMGPXEeVERaRVYDCw6pzyroQ3BfzqaDDLka7nAPMk/QnQsffK8AASaPM7Jykoqi+j6XQx3O7pEWEc3l7CrdKoWX6J5ZMVJamY/xD0vvPiymO3upAiBx6hXT5f6o+F4TEhpuY2Rsp6jTkUkKfW7myxuhCiPcvnM8VCWkYFkpKGiwwPrqpborLB5BwkFnGegD5NPptJW1pcWpBSVuweITegoQytmjgS39cISVCWSwO7JL0N8IUbg/E5V0Jfv3EKCSb+zlLtpCTvv5mEWFRdTga0LMS10PkOELndSfCQLG1zexThfj2lwkPhCRU3ccS+4vOBc5VCDH+f3H/aeZkvYgQlTahkg7pyDvAk5Lup74vPNGbk5n9ung5+vdvTaNARudia0KUXepR68poRC7ZjDg/jOAZKAyee5rQSExDFnoA+TT6PwNGRT+2CCP0fhZP3lkJZSyUtG7htTW6WdIay63N7OeFBTN7MLqe0nA34RXv0Qr2DxlEWGQUjvagpKFm9nDKehDScswGZkuabDGplZl9ISlRR2z0o78APMniPpbD0vaxRFlrE1q4+xGuSeK5SyMfAq9VYfAhvDF+QBiVmjYEtxT/I8UbaIEMzsUuafdZRCYjcmPkzwNUN+J8PqHv7cKk+10aehTInU+/gEJisELEQ9q6Qwgtw3cID461CUaiqc7mhjIeIhjs4le+7c0scY78NJ21jdTfHLgE+A5h5qnuwL6WYmSwpMJo4G+b2emS1gK+ZYsnaU8iY2/CeWhDiLZI7NJQmEhmeKx7E/CTWF/ATWa2cUIdxlqVI4slvUhw+91B8GWXGydRSsYWBPfOU1TQSm8gq9J5YYunOmxDyCN0u5klfoPL4lxEOdWOWl/bFqcg7wJ8luSBqkamrSxgyaYULZkDqUhGqrdbVZ4bqr6cvBh9NTKyrkDaP1WMaih0Mk2ylLMUxQ7dEYTXtMJUdqen7Mg9g5Bq9oE0+24go9oIiysIfvidzWzjeGM+bGaJR/XG1/c9qcClUSaqK3EOGEnnEQbHVTOyeEMzm1RJ3SIZDwPzaNC/YSnixVX9vLDFmWgXAO+b2ZTGtm9ERhbnouKIKkl/JDyo3oz/1QeB/oTj+YmZNekSKXNfmSWIFlQjKdiLhKRJxV4yN1QSPZaQlSOjP6Kp9Sn/VKVGAs8hGK3pKfVa0eqnU0hTdy6hQ+crUraQY/23gT+b2ZVFZfdZTCyXUEahE7yupawU09HF7Z8m5CapdNRl1RSdywUszmeUqgM1vj0WHuQQWuunp3mbVDbzLLxIcGHcU3RNmnWKz4zORcURVZJeB75jZibpCMIb4BBgA+B6M9uySQEZoxApV2gIvVSBncgkNxS0Up++Qm6UVC3rNEY9AYcTwiQLrYEdCVE56yjM7HNjYxULKEzhdjXB75i6NQaZjIb9BthJ0lZx318Tcr2kkqEQclnoF+hO+iHzhY7HB6nSpVEpGZxLCKOgX2NxFsaDCG7ANH/MB6ro36jDzD6U6gUfJe7ziY2ac4DVWOwqS/UAJJtzUU1E1ddFb2zDgFsshDtOjG+3yRQIAQEnEOa0PSJ2Sm9oKQYgSvoxYaDbk1H/SyWdZGZ3NlmxPlXnhirQKo0+cJikp62CUC5J11I613jiAR+E87axmU2LMlcnhLNtRXDTlDX6hE6dYYRRi5jZq5K2b7pKfRrbPoWf7wsz20/Sb4FnJO1L8iH/BS4B7gJWk3QmoYX5h5Qy3o2frDoeE6Mw0K5Twz+gQkqHzy1FBldgXTMrTulxmsKkP2k4CviNQhhe6re3SFXzwhIiRX5gZmnncS0mi3NRTUTVVwqDoKYBOwG/KVrXMaUOY1icmPAjQj9FmlHnvydE/E2HuobRoyweVJmELHJDAa3X6K8I3Bo7VIcQBhclbRUWX6wOBD/hx41s2xi9CgY/Mj2WzZKU2CdeTWssUtxr34HQsz+G5JNdFFJYnKuQy+hhUgxRj3VvljSGcB0E7JXWWBTewlTFyOAq+COlQ2WfIowfSGP0v5Q02OKgOIXJwb8sU6eOGEW0iyUcONQEpeaFPTpF/WlVGnyo8lxAeNNTiAyrJKLqWIJR7Q5caGbvRj2+T7p5i9eNDaPhUacv1OBPm4A2Ddw5M0mf5DGL3FBAKzT60Ve4CSHiZCohBKt/0vpm9o8G8m4h2UCmYp5USN9bGBK9TyxbEfgsoYxqW2NYg9HDCqNRL0ohoi7XvIXc7UOBQ9PoEOu+GX3ibaMey5lZ0jEPxBbZjcQHjqRPgYPN7PW0ulRAezNbIuuihVj/VLOyEVrp18d7VIQBa4cmrWwhW+hlxGyhlWIhbPWAtPWK+qpGS7qN0KqsNOdNxedC9Uetvxc/deuSBDuY2YuEtAUNyx8A0gQ+fK2QXLHgvlyX9Blc/60QrXdLXN4vjQ7Rffq6mW0EdVl+K6bVdOTGA/8tYZBHe+JrL2Fm+ZFVyN0QuN/M1ktRRwTfZKEF8hxh4uM0+VFWJbTGvks4loeBY63CofNFer1uZYZkN+HS2AeYk8SloTCgq53FiTUkvU/ozF6e0FGWdMwDCnOh/t5iyKukHQmdVInnQq0USf8lTES+oEF5O+ANM1u/ApkrA5jZ5xXUzSKKqKLkXNH12RiW0gVakJn6XChEcxVGrZfSI820pFWhkPTuD4SG5sPAIOBQM3sypZwfxboQQk/vamr7EvXvBn5t6UY0l5bVWox+geg/P5fQMQThVamfLR6oUK5+wxz0nwCnNnwDSCCnqtmNskDSpdSPp+4PvGdmTaaTUMh6uFfDFm58EN1rZmWz/0V30HZFkRVjLUzO3paQdXRw0xLqyVoi2qdU2dJAIWHc6sCvio5lJcID+VNLMNuUMgwHVjZRRFUl51IV+WqyPBctBYUBjFsTrsUL8U2quXV4mvAG+BJFU6JauvmsgVbo3jGzaQqhXJMIHaHfI7zK/jlh/aqjNJTB7EYK82Qea2G2p8LAi/NTtqZGF/1eQIhQSOIPzsSlYfVDTS+OZQvj63Aa3lGY7rDQAX4gIaInFaogqovQijsDeD++rUDIiXQN4a0yCVlE/gCZRRH1pX5yrisoSs6VoH41+WoyOxcAqj5tdzX7LiTymxltzlaEbJ2JjH6JBmbdKtJ3zie9F8vS6lr6AJKOBp619KPzqs7GGOWMI85uZIvjoCeY2aYpZIy1BiNAS5UtDbJwaUQZfazBYC6FgTCvpXGLxAfeaSx2lz1NSD37WVIZUc4vgEqjulYACi6+yWaWqtNxaRD9x8OB4ZYuI+MkYEuLMfHRr/6SmW3Y1D2mxflqjqN+yoCVgb2b482rgT6l0na/bGZl54pQ6bE0dZTrn1BRIj/CCOm6RH5AokR+kv4FfIswuOw2q/FcvXWYWav7AB0rqPNzQmfSB/H3fwlJpCYR8sWkkfVi/B4bv5cDxqeU8SrQpWi5K2FwV5K6EwhZ+hp+JiTRAzibEIq2YlHZSoTW7TkJdfg/Qix2x6KyFaPcs6q8vhsSJjVPW+/EeB66E1Jmn9BM9+PPCa4+CK24UQQf+nhgQEpZaxLiwl8muHdGAJumlHE4wZ1zLSGT6juEnFMrEgbjNVZvh7i/qfG78DmhcHzNfC7GEyJfCsttk/7P4rFfC9xPyGT7j/iZRZjPulz91wmZLdciuFNWjeUdCf1mSY+hMyHh2kOEiLBfErJjpr3Hto73xDzga0Kk3+cV3a9L88/Qkj5ZXcRY51zgd8CbBPfSXcCZKWUcHOv/ieBeeBM4KGHdteOndzyutYs/CeovRzD8nxJCPMcQ5g09m9A5m0SHtk3IWC6hjL6EzrHX4nlYI/4xpwDHpzyfnQkPrUUEV9cswoCa5ri3XiucN8LIzzEE//l3CZ12SWQcQRjs9994P/QF3q1AlzaE1voahNQWexJCmtPIKHsPLc1zUSRrfLGBJDSM0jauHgbWKFpeA3goQb2xRb9fbWxdyuvyk/h/Sd0YIbhy1yOEm7YlPEgqalxl/gdoqZ8sL2K8gD8ndJTdCfy8Qp36EObq/RXB3VKJjFeqOCcrAJvGzwrNLYMw+fehhJb9cYSBNOcSZmpKKqMtcCrwBaH1syh+j2zGe2tc0e+/E/pqUl0fQuvtKWBgUdk7FeoztsJ6F8XvewmDBut9mutcFG0/nDAA6TrCHNTvAvullDGxwXKbhmWN1HuT4MrZnBBKPYDQp7F5kvpFcrYl9IeMI0ytul2F12Z0/B5fVFbRdW6VPv1KUEbZGKOsg4B/WVG0jqTdrYJOJkmrUT9rXqqQLMXcN2n32xJQgyyhkt6xCsPxqo3qqoYYybQbwY3wPiH53Otx3cQk91aMENmXcI9+C7idEBrYqwJ9Kgr7lLS5mY1R/YRrdViC+PAszkUDeWtQP2dNqrTdcdzD+tSPkZ9sDeYMKFHviabWW4JEfpLeI4zbuRV4nAbzdViKidFj9M53CalbPiG44A61CvpZ8mT0q76IRbI+IwwYGW5x5GJa4xujEs4n+HCnE1wzEy1Bh53qp329mcUPMCDdzVRLih7EBd3rHUva45B0IsF3XYjq6mpmiaK6qkFhtqu/Et467rU4T0I0nr81s91SyutJME7DCX74uyxB52VR/arDPkvIvM3M9kuwXdXnIotxJA3q7c3iCKCnLWWMfKUojCYuGFij/rgDsxQZMhVSlk8njIM5nuDO/IuZTU6tV16MfpZIGkvoLLuR4Ea4I23kjcJMWzsTJlgfIGkn4EAzOzxB3aYeYKluplqS9XFUGtWVBTEyrJOFCV0KZSsS/mMVp5WQtAGwv8VBcLVC0gdmtlbCbas6F1mMI2lQb21qPKamUhQSvJ1HCA2fAPzGzD6qSqYb/fRocTrhVQmvja8CQy3dJNqjzWxgNP4DLAzBb5YBScsqqnDSkKL6lcT5t2gqDfssISex0a+Wwn+jkXXjU/7P6sbUmNm60YheaWaJx9TUEknPEJI5Pg3sQUg1nTqdcjGtbnBWC2Eq1A1mGkaI502bq/yzOPLzaeBmSdMpGmnnpKcagx85TBVmb21JSFqTELI6nNDBflZcLlevMfekCLNgNRcrq0T+pjiOJO3Av6OJY2oAzOyt2I/WWuhkZn+Lv/8c+0yqwo1+ShTSDNQlRrMw8cdJpJynkhBKN5/gnzuA4KOr6Su8U1X21pqjMFnIcMII8dsJLsi7LflcEuc3se7NKtVLwz+Bv0kqlRojTdI3gK/M7GvFxJjR9dSa3BsdJA1gcX/ACsXLlfTfuXunAiT9J61fsaWShUtjWXCLxFGrFxDinxcRs7emjaaqJQoTwf8HONHMRseyiiOiakU0zGcQOuWXSI1hKab0lHQuIYLmYODXhMFRb5jZ77PUeWmxNPrv3OhXgEIukx6EOP3i5EdlWyEl8nEUsoVWHWFRCdWkLshSRq3QUsreWguyDPtsCWSRGkNhjoLDgaGEa/tQkbskl7jRrwCVTkFrliBZWoN8HLfWuiUZwxwPIbgzKnJpZCGj1tQyzn9pUG3Y57KCpGPN7OJyZWVktPo32WJyafRrfRGjK+GHhM61DsBthAdA2ckhloIeVbk0lgW3SIEs4vxborusVmGftf6fRR2WGD9TQXh1Fm/DNT8XBVJN2bUMcZikJicaaQpJG0h6TNJrcbmvpMTzwprZHDO7FtiVMJDldCqYsSruu30FddoqTIIytWi/Ai5JaqyzkNFAXurjWArMJxjHD8zsmgoHdlV1b2Uoow4z+2+N4vwzPY40SBou6V5gHUn3FH2eIORlSkOhg7+7pP1VZs6ARqjZuVgCqyB3Q2v/UGU2RkKOlC2pn8/ntRT1M8nHEWX9gsrz9qxOyGmyKH5mkD45V9Uyqj2ODO+L1Nlbs763spLREj61PA7CCPcdCR3bOxR9NiNhQsAoJ5NEfi3pmuYuZFNVzrEb6WhmL6n+/MiJ5oRV/XwcRxTqFWKkLX0IVsVhhlblhDRZyYjUPFzSqozzz+Leyuj+rDm1Pg4LuevfByqKsivRwQ8VvsnW+lwsQa1bA8345M8sGyPwIGFY9CtxeR/gwYR1nySk0H2CkITpieLllHpU3QohDF7pV+W5rUpGFsfR2u+tLO/PjI6pfa3ORUb6Pxu/5xKMbOEzlxR56KniTbalnIsl9KrlzmtywNm4NL4NPBov5kfAs0DvZjyGLB9gWbg0KpLRUv8UNb63MnGXZXAsVbnbsjqOSh8+GZ+LEwkDMtcihH+e1JrPRe46cs1sGsG3NokwCckpBHdEGhnvmNl3Cf65jcxssJm9l7GqTe1/oZmdBaxDSBMN4dXz6DgEP42salMXVCwjy+NoCWR0b1UtIyOq6rzM8Diq7gCNQQdrSlqr8EkpoqoO/pZ0LiC/aRjqLiLBrZCIxm78gm/fmt8HnZU/vaYsK8cRqejeylpGNSGCGfqgszgXVfX1SPo1YcrHaYRWNoTBd4mTtgHXZtA4qvm5KJBXo1/pRexU9PtIQrhlrcniZmoJLCvHkYWByEJG6uRxWXZeRqo6jowePscCG5rZzEr1yOJtmJZxLoKs6CtyUpJ2gEcjMnoQQsvqHr5m9nRKGVWlE24pLCvH0VKoZpR0rUcnZ5kaI8blf88aZOxsLSyNNCFu9Cuk1Ei/lPXPIQyTf4PQeQkhlcMeWejn5JeMRlrXZBayBjpU/fCRdA1hDub7gTp3V3O7YqslywdxXt07LYG9CK+dLWJottP6ydg9U3N3W0Z9PR/Ez/Lx0yrJst/LW/opkDSBxRky1wMK81MWMmSmmdHnQWBfq2IqPccpRUYt5BbhblMNp8BsaWR1Ltzop0Bhrs1GsTAKsJyMSwkPjh5AP+Ax6r92HlOlmo7TItwzWVDpwyfm3Sk2bgZ8CjxhZjeVrtWyyepB7Ea/mZF0SBOrzcxuaDZlnGWWvLeQJe1QorgrcCDwlpmd0swqtRjc6NeILPJ8O05jtBT3TEsj9nuMMbP+tdalVuRuRG4LolSL/9DmVsJZNnGDXxozW1h+q2Ubj95pZiQNB34CfFvSPUWrOpE+z7fjOCWQ1LVEcRfCXLmvN7M6LQo3+iloEL2zBAmjd54njKhbFTi/qHwuIT+H4zjVM4bFc0/D4o7cJ4GjaqRTi8CNfjp2j99Hx+8b43fi5Elm9r6kKcB8M3sqS+UcxwmY2Tq11qGl4h25FVAqBUPaEbqSHgN+aGZzMlfQcRynEbylXxmSNMjMnosL25K+U3weMEHSI8D/CoUep+84ztLEjX5lHA6MijlOBMwGfppSxj/jx3Ecp9lw904VRKNPpS4aScsDG8TFSWb2TVa6OY7jlMKNfgVIag/8iDALTnFa5NNTyNiRMIXae4S3hV7AIWlTKzuO46TB3TuVcTcwhxAWVmmWzPOBoWY2CUDSBsAtwOaZaOg4jlMCN/qV0dPMdqlSRruCwQcws/9KalelTMdxGlDN1JHLIp6GoTKel7RplTJGS7pa0o7x8zdgdBbKOY5Tj0wmFF9WcJ9+BUh6g5BP/12Ce6eSfPrtCYO8BseiZ4C/eIvEcbKlmqkjl0Xc6FdAY3n1k+TTdxyn+chi6shlDffpV0DBuEtaDeiQpm6cqLmxJ62Z2ZAq1XOc3JPx1JHLFN7SrwBJexCib9YEpgNrAxPNrE+CuqWic7Ym3KDTzWyLLHV1nDyT5YTiywrekVsZfyIY6v/GxE5DgBeSVDSzMYUPsBJwDjAc+IUbfMfJFjObRsheO4kwruYUUiRIXBZx905lfGNmMyW1kdTGzJ6QdFHSypKGAX8gdAKfaWZPLC1FHcdhPrB/dOtcU2tlao0b/cr4TNJKwNPAzZKmU5Q0rSkkvQx0B/4M/CeW1WXnNLNXslfXcXLNtT6T2GLcp18BklYEviS4xw4AOgM3m9nMBHWfZHFHbvEkDxA6cnfOVlvHcZzFuNF3HMfJEd6R6ziOkyPc6DuO4+QIN/qO4zg5wqN3KkDS+sBZwCYUjcg1s2/XTCnHcZwEeEu/Mq4FrgAWADsBNwA31VQjx3GcBHj0TgVIGmNmm0uaYGabFpfVWjfHcZymcPdOZXwlqQ3wlqRfAR8RUiqkwid3cBynuXH3TmUcC3QEjiFMb3gQIV93WnxyB8dxmhV379QQn9zBcZzmxt07KZB0kZkdJ+leSuTEN7M9UsjqTIj++Q4wlTi5Q0aqOo7jlMSNfjpujN/nVSrAJ3dwHKeWuHunRvjkDo7j1ALvyK0ASYMkPSLpv5LekfSupHfSyPDJHRzHqQXe0q8ASW8CxwNjgIWF8iSplRvIORp41sxezVZDx3Gc0rjRrwBJL5rZVhnI6eiTOziO05y40a8ASWcDbYF/EqY8BHzWK8dxWj5u9CtAUqk5bX3WK8dxWjxu9B3HcXKER+9UgKRjJa2swNWSXpE0tNZ6OY7jlMONfmX81Mw+B4YC3Qix9mfXViXHcZzyuNGvDMXv7wM3mNnrRWWO4zgtFjf6lTFG0sMEo/+QpE7Aohrr5DiOUxbvyK2AmEu/P/COmX0mqRvQw8zG11Yzx3GcpvGEaxVgZosk9QR+IgngKTO7t8ZqOY7jlMVb+hUQB2dtAdwci4YDL5vZ72qnleM4Tnnc6FeApPFAfzNbFJfbAmPNrG9tNXMcx2ka78itnFWKfneulRKO4zhpcJ9+ZZwFjI3pGARsT0iN7DiO06Jx906FSFqD4NcHeMnMPqmlPo7jOElw905KJC0nSWY2FRgLLA+sUWO1HMdxEuFGPwWSfg5MB96Pvx8D9gFulXRyTZVzHMdJgLt3UiDpdWAw0AmYCKxtZp9K6kgI2exTUwUdx3HK4B256fjazGYDsyVNNrNPAczsC0lf11g3x3GcsrjRT8cKkgYQ3GLLx9+Knw411cxxHCcB7t5JQSMzZtVhZjs1ly6O4ziV4EbfcRwnR3j0juM4To5wo+84jpMj3Og7juPkCDf6juM4OcKNvuM4To5wo+84jpMj3OhXiKT2tdbBcRwnLW70K+cwSZvUWgnHcZw0uNGvnBUJ2TW7S9pf0gm1VshxHKccnnunAiR1BjYBvgNMBT4H+tdSJ8dxnCR4Sz8FktpKOpVg6A8tFAOXmNkHNVPMcRwnIZ57pwIkrQ6cCxwUi2YC/czs49pp5TiOUx5v6VeAmU0DxgOTgN6ESdEPqKVOjuM4SfCWfoVIOhp41sxerbUujuM4SXGjXyGSOprZF7XWw3EcJw1u9B3HcXKE+/Qdx3FyhBt9x3GcHOFG38k9kkzS+UXLv5E0soYqOc5Sw42+48BXwA8lrVprRRxnaeNG33FgAXAVcHzDFZJ+IOlFSWMlPRoH5iFppKTrJT0j6X1JP5R0rqQJkv4tqV3cbnNJT0kaI+khSWs076E5Tn3c6DtO4HLggJhXqZhnga3NbABwK/DbonXrAjsDewA3AU+Y2abAl8Bu0fBfCuxjZpsDo4Azl+5hOE7TeMI1xwHM7HNJNwDHEIx2gZ7AbbGFvjzwbtG6B83sG0kTgLbAv2P5BMJI7Q0JSfkekUTcZurSPA7HKYe39B1nMRcBhxPSZhe4FLgstuCPBDoUrfsKwMwWAd/Y4kEviwgNKgGvm1n/+NnUzIYu5WNwnCZxo+84ETObBdxOMPwFOgMfxd+HpBQ5CeguaRsASe0k9alaUcepAjf6jlOf84HiKJ6RwB2SxgCfphFkZl8D+wDnSHoVGAdsm42ajlMZnobBcRwnR3hL33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHOFG33EcJ0e40Xccx8kRbvQdx3FyhBt9x3GcHPH/AS3gt81dOQv/AAAAAElFTkSuQmCC\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"(\n",
" covid_19_status\n",
" .assign(confirmed_cases_per_m=lambda x: pd.to_numeric(x['Per Million'], errors='coerce'))\n",
" .assign(deceased_cases_per_m=lambda x: pd.to_numeric(x['Per Million.1'], errors='coerce'))\n",
" .fillna(0)\n",
" .sort_values(by='deceased_cases_per_m',ascending=False)\n",
" .iloc[:20,:]\n",
" .plot\n",
" .bar(\n",
" x='Name', \n",
" y='deceased_cases_per_m',\n",
" title='Death Cases by COVID-19 by Country')\n",
");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More examples\n",
"\n",
"### Using Functions \n",
"\n",
"Functions are a more complicated topic and you are not expected to write functions most of the times. Functions are also available in Excel as Macros. You should be familiar with using functions, read them to understand what they do, and also modifying them to fit your needs.\n",
"\n",
"We often find the a set of commands is repeating itself many times in our analyses. This is the time to wrap these commands as a function. In the next cell, we will create a function that is wrapping a few of the commands that we used in the previous cells such as choosing a specific index of a table from the table list and setting the first row as header.\n",
"* Read the HTML content of the given URL\n",
"* Parse the HTML and extract all the tables in it\n",
"* Take the first (of other given index) table from that list\n",
"* Set the header of the table from the values in the first row\n",
"* Skip the first (header) row\n",
"* Return that table"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"def get_table_from_html(url, attrs=None, index=0):\n",
" response = requests.get(url, headers=header)\n",
" df_list = pd.read_html(response.text, attrs=attrs)\n",
" df = df_list[index]\n",
" df.columns = df.iloc[0]\n",
" df = df[1:]\n",
" return df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once we have the function defined (_def_), we can try it in a simple way. Let's say that we want to add to our CRM (Customer Relationship Management) system a new target market, as the consumer good companies (CPG). We find in the web a table with the top CPG companies for 2019. "
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"url_top_companies = 'https://consumergoods.com/top-100-consumer-goods-companies-2019'\n",
"df = get_table_from_html(url_top_companies)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the table"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
"
\n",
"
0
\n",
"
Rank/Company
\n",
"
Net Revenue ($M)
\n",
"
1-Year Sales Growth
\n",
"
Key Product Categories
\n",
"
\n",
" \n",
" \n",
"
\n",
"
1
\n",
"
1. Nestle SA*
\n",
"
$92,085
\n",
"
2.10%
\n",
"
Food, Beverage, Confectionery
\n",
"
\n",
"
\n",
"
2
\n",
"
2. Procter & Gamble
\n",
"
$66,832
\n",
"
2.70%
\n",
"
Household Goods, Health & Beauty Aids, OTC Pharma
\n",
"
\n",
"
\n",
"
3
\n",
"
3. PepsiCo
\n",
"
$64,661
\n",
"
1.80%
\n",
"
Food, Beverage
\n",
"
\n",
"
\n",
"
4
\n",
"
4. Unilever N.V.*
\n",
"
$56,188
\n",
"
-5.10%
\n",
"
Household Goods, Food, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
5
\n",
"
5. Anheuser-Busch InBev
\n",
"
$54,619
\n",
"
-3.20%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
6
\n",
"
6. Christian Dior*
\n",
"
$51,607
\n",
"
7.20%
\n",
"
Apparel/Footwear/Accessories, Wine & Spirits, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
7
\n",
"
7. LVMH Moët Hennessy Louis Vuitton*
\n",
"
$51,607
\n",
"
9.80%
\n",
"
Apparel/Footwear/Accessories, Wine & Spirits, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
8
\n",
"
8. JBS S.A.*
\n",
"
$44,587
\n",
"
11.30%
\n",
"
Food
\n",
"
\n",
"
\n",
"
9
\n",
"
9. Tyson Foods
\n",
"
$40,052
\n",
"
4.70%
\n",
"
Food
\n",
"
\n",
"
\n",
"
10
\n",
"
10. Nike Inc.
\n",
"
$36,397
\n",
"
6.00%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
11
\n",
"
11. Imperial Brands PLC*
\n",
"
$33,641
\n",
"
0.90%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
12
\n",
"
12. 3M Co.
\n",
"
$32,765
\n",
"
3.50%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
13
\n",
"
13. Coca-Cola Co.
\n",
"
$31,856
\n",
"
-10.00%
\n",
"
Beverage, Food
\n",
"
\n",
"
\n",
"
14
\n",
"
14. L’Oreal*
\n",
"
$29,687
\n",
"
3.50%
\n",
"
Health & Beauty Aids
\n",
"
\n",
"
\n",
"
15
\n",
"
15. Philip Morris International
\n",
"
$29,625
\n",
"
3.10%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
16
\n",
"
16. Danone*
\n",
"
$27,168
\n",
"
-0.60%
\n",
"
Food, Beverage
\n",
"
\n",
"
\n",
"
17
\n",
"
17. British American Tobacco PLC*
\n",
"
$26,993
\n",
"
25.20%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
18
\n",
"
18. Kraft Heinz
\n",
"
$26,259
\n",
"
0.70%
\n",
"
Food
\n",
"
\n",
"
\n",
"
19
\n",
"
19. Mondelez International
\n",
"
$25,938
\n",
"
0.20%
\n",
"
Food, Confectionary
\n",
"
\n",
"
\n",
"
20
\n",
"
20. Haier Smart Home Co.
\n",
"
$25,755
\n",
"
12.20%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
21
\n",
"
21. Altria Group
\n",
"
$25,364
\n",
"
-0.80%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
22
\n",
"
22. Heineken Holding N.V.*
\n",
"
$24,765
\n",
"
3.90%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
23
\n",
"
23. Adidas AG*
\n",
"
$24,153
\n",
"
3.30%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
24
\n",
"
24. WH Group Ltd.
\n",
"
$22,605
\n",
"
1.00%
\n",
"
Food
\n",
"
\n",
"
\n",
"
25
\n",
"
25. Henkel AG*
\n",
"
$21,931
\n",
"
-0.60%
\n",
"
Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
26
\n",
"
26. Whirlpool Corp.
\n",
"
$21,037
\n",
"
-1.00%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
27
\n",
"
27. Japan Tobacco*
\n",
"
$20,565
\n",
"
3.60%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
28
\n",
"
28. Fonterra Cooperative Group
\n",
"
$20,438
\n",
"
6.30%
\n",
"
Food, Beverage
\n",
"
\n",
"
\n",
"
29
\n",
"
29. Asahi Group Holdings*
\n",
"
$19,677
\n",
"
1.70%
\n",
"
Wine & Spirits, Beverages, Food
\n",
"
\n",
"
\n",
"
30
\n",
"
30. BSH Hausgerate*
\n",
"
$19,667
\n",
"
-3.00%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
31
\n",
"
31. San Miguel Corp.*
\n",
"
$19,652
\n",
"
24.10%
\n",
"
Wine & Spirits, Food
\n",
"
\n",
"
\n",
"
32
\n",
"
32. Kimberly-Clark Corp.
\n",
"
$18,486
\n",
"
0.80%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
33
\n",
"
33. Kirin Holdings*
\n",
"
$17,916
\n",
"
3.60%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
34
\n",
"
34. Associated British Foods*
\n",
"
$17,164
\n",
"
1.40%
\n",
"
Food
\n",
"
\n",
"
\n",
"
35
\n",
"
35. General Mills
\n",
"
$15,740
\n",
"
0.80%
\n",
"
Food
\n",
"
\n",
"
\n",
"
36
\n",
"
36. Colgate-Palmolive Co.
\n",
"
$15,544
\n",
"
0.60%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
37
\n",
"
37. Kering*
\n",
"
$15,060
\n",
"
26.30%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
38
\n",
"
38. Grupo Bimbo*
\n",
"
$14,765
\n",
"
7.80%
\n",
"
Food
\n",
"
\n",
"
\n",
"
39
\n",
"
39. Kao Corp.*
\n",
"
$13,995
\n",
"
1.30%
\n",
"
Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
40
\n",
"
40. Stanley Black & Decker
\n",
"
$13,982
\n",
"
7.80%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
41
\n",
"
41. Johnson & Johnson (Consumer)
\n",
"
$13,853
\n",
"
1.80%
\n",
"
OTC Pharma, Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
42
\n",
"
42. VF Corp.
\n",
"
$13,849
\n",
"
17.30%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
43
\n",
"
43. Uni-President Enterprises*
\n",
"
$13,845
\n",
"
7.90%
\n",
"
Food
\n",
"
\n",
"
\n",
"
44
\n",
"
44. Estee Lauder Companies
\n",
"
$13,683
\n",
"
15.70%
\n",
"
Health & Beauty Aids
\n",
"
\n",
"
\n",
"
45
\n",
"
45. Kellogg Co.
\n",
"
$13,547
\n",
"
5.40%
\n",
"
Food
\n",
"
\n",
"
\n",
"
46
\n",
"
46. Diageo PLC*
\n",
"
$13,405
\n",
"
0.90%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
47
\n",
"
47. AB Electrolux*
\n",
"
$12,821
\n",
"
2.80%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
48
\n",
"
48. RB*
\n",
"
$12,597
\n",
"
-0.20%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
49
\n",
"
49. Essity*
\n",
"
$12,239
\n",
"
8.40%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
50
\n",
"
50. Compagnie Financiere Richemont SA*
\n",
"
$12,100
\n",
"
3.10%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
51
\n",
"
51. Nipponham Group*
\n",
"
$11,779
\n",
"
5.60%
\n",
"
Food
\n",
"
\n",
"
\n",
"
52
\n",
"
52. Keurig Dr Pepper
\n",
"
$11,024
\n",
"
2.30%
\n",
"
Beverages
\n",
"
\n",
"
\n",
"
53
\n",
"
53. MolsonCoors Brewing Co.
\n",
"
$10,770
\n",
"
-2.10%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
54
\n",
"
54. Shiseido Co.*
\n",
"
$10,160
\n",
"
8.90%
\n",
"
Health & Beauty Aids
\n",
"
\n",
"
\n",
"
55
\n",
"
55. Pernod Ricard*
\n",
"
$9,905
\n",
"
-0.30%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
56
\n",
"
56. Nintendo Co.*
\n",
"
$9,797
\n",
"
115.80%
\n",
"
Toys & Games
\n",
"
\n",
"
\n",
"
57
\n",
"
57. China Mengniu Dairy Co.*
\n",
"
$9,691
\n",
"
14.70%
\n",
"
Food
\n",
"
\n",
"
\n",
"
58
\n",
"
58. PVH Corp.
\n",
"
$9,657
\n",
"
8.30%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
59
\n",
"
59. Hormel Foods
\n",
"
$9,546
\n",
"
4.10%
\n",
"
Food
\n",
"
\n",
"
\n",
"
60
\n",
"
60. Coty, Inc.
\n",
"
$9,398
\n",
"
22.80%
\n",
"
Health & Beauty Aids
\n",
"
\n",
"
\n",
"
61
\n",
"
61. Carlsberg A/S*
\n",
"
$9,233
\n",
"
3.00%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
62
\n",
"
62. Saputo Inc.*
\n",
"
$8,766
\n",
"
3.40%
\n",
"
Food
\n",
"
\n",
"
\n",
"
63
\n",
"
63. Campbell Soup Co.
\n",
"
$8,685
\n",
"
10.10%
\n",
"
Food, Beverages
\n",
"
\n",
"
\n",
"
64
\n",
"
64. Newell Brands
\n",
"
$8,631
\n",
"
-9.60%
\n",
"
Household Goods, Housewares/Appliances
\n",
"
\n",
"
\n",
"
65
\n",
"
65. Swatch Group SA*
\n",
"
$8,535
\n",
"
6.10%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
66
\n",
"
66. BRF - Brasil Foods*
\n",
"
$8,474
\n",
"
3.20%
\n",
"
Food
\n",
"
\n",
"
\n",
"
67
\n",
"
67. GlaxoSmithKline Consumer Healthcare*
\n",
"
$8,440
\n",
"
-1.20%
\n",
"
OTC Pharma
\n",
"
\n",
"
\n",
"
68
\n",
"
68. Beiersdorf AG*
\n",
"
$7,972
\n",
"
2.50%
\n",
"
Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
69
\n",
"
69. Conagra Brands
\n",
"
$7,938
\n",
"
1.40%
\n",
"
Food
\n",
"
\n",
"
\n",
"
70
\n",
"
70. Hershey Co.
\n",
"
$7,791
\n",
"
3.70%
\n",
"
Food, Confectionery
\n",
"
\n",
"
\n",
"
71
\n",
"
71. Dean Foods Co.
\n",
"
$7,755
\n",
"
-0.50%
\n",
"
Food, Beverages
\n",
"
\n",
"
\n",
"
72
\n",
"
72. Constellation Brands
\n",
"
$7,585
\n",
"
3.50%
\n",
"
Wine & Spirits
\n",
"
\n",
"
\n",
"
73
\n",
"
73. Groupe SEB*
\n",
"
$7,508
\n",
"
5.10%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
74
\n",
"
74. Thai Beverage Public Co.*
\n",
"
$7,496
\n",
"
20.90%
\n",
"
Wine & Spirits, Food, Beverages
\n",
"
\n",
"
\n",
"
75
\n",
"
75. J.M. Smucker Co.
\n",
"
$7,357
\n",
"
-0.50%
\n",
"
Food
\n",
"
\n",
"
\n",
"
76
\n",
"
76. PT Gudang Garam*
\n",
"
$6,805
\n",
"
14.90%
\n",
"
Tobacco
\n",
"
\n",
"
\n",
"
77
\n",
"
77. Hanesbrands
\n",
"
$6,804
\n",
"
5.10%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
78
\n",
"
78. Hermes International*
\n",
"
$6,575
\n",
"
7.50%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
79
\n",
"
79. Unicharm Corp.*
\n",
"
$6,388
\n",
"
7.30%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
80
\n",
"
80. Bandai Namco Holdings*
\n",
"
$6,295
\n",
"
9.40%
\n",
"
Toys & Games
\n",
"
\n",
"
\n",
"
81
\n",
"
81. Post Holdings
\n",
"
$6,257
\n",
"
19.80%
\n",
"
Food
\n",
"
\n",
"
\n",
"
82
\n",
"
82. Ralph Lauren Corp.
\n",
"
$6,182
\n",
"
-7.10%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
83
\n",
"
83. Clorox Co.
\n",
"
$6,124
\n",
"
2.50%
\n",
"
Household Goods
\n",
"
\n",
"
\n",
"
84
\n",
"
84. Bayer Consumer Health*
\n",
"
$6,006
\n",
"
-7.00%
\n",
"
OTC Pharma
\n",
"
\n",
"
\n",
"
85
\n",
"
85. Tapestry Inc.
\n",
"
$5,880
\n",
"
31.00%
\n",
"
Apparel/Footwear/Accessories
\n",
"
\n",
"
\n",
"
86
\n",
"
86. McCormick & Co.
\n",
"
$5,409
\n",
"
11.90%
\n",
"
Food
\n",
"
\n",
"
\n",
"
87
\n",
"
87. Savencia SA
\n",
"
$5,360
\n",
"
0.20%
\n",
"
Food
\n",
"
\n",
"
\n",
"
88
\n",
"
88. Kewpie Corp.*
\n",
"
$5,323
\n",
"
2.10%
\n",
"
Food
\n",
"
\n",
"
\n",
"
89
\n",
"
89. Avon Products
\n",
"
$5,248
\n",
"
-5.70%
\n",
"
Health & Beauty Aids, Apparel/Footwear/Accessories, Household Goods
\n",
"
\n",
"
\n",
"
90
\n",
"
90. Electronic Arts
\n",
"
$5,150
\n",
"
6.30%
\n",
"
Toys & Games
\n",
"
\n",
"
\n",
"
91
\n",
"
91. First Pacific Co.
\n",
"
$5,136
\n",
"
-1.90%
\n",
"
Food
\n",
"
\n",
"
\n",
"
92
\n",
"
92. ITC Ltd.*
\n",
"
$5,048
\n",
"
-22.00%
\n",
"
Tobacco, Food, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
93
\n",
"
93. Herbalife Ltd.
\n",
"
$4,892
\n",
"
10.50%
\n",
"
Food, Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
94
\n",
"
94. Sapporo Holdings Ltd.*
\n",
"
$4,843
\n",
"
-5.40%
\n",
"
Wine & Spirits, Food, Beverages
\n",
"
\n",
"
\n",
"
95
\n",
"
95. Arcelik A.S.*
\n",
"
$4,665
\n",
"
29.10%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
96
\n",
"
96. Hasbro Inc.
\n",
"
$4,580
\n",
"
-12.10%
\n",
"
Toys & Games
\n",
"
\n",
"
\n",
"
97
\n",
"
97. Mattel Inc.
\n",
"
$4,511
\n",
"
-7.60%
\n",
"
Toys & Games
\n",
"
\n",
"
\n",
"
98
\n",
"
98. Husqvarna AB*
\n",
"
$4,244
\n",
"
4.30%
\n",
"
Housewares/Appliances
\n",
"
\n",
"
\n",
"
99
\n",
"
99. Church & Dwight Co.
\n",
"
$4,146
\n",
"
9.80%
\n",
"
Household Goods, Health & Beauty Aids
\n",
"
\n",
"
\n",
"
100
\n",
"
100. Spectrum Brands Holdings
\n",
"
$3,809
\n",
"
2.80%
\n",
"
Household Goods, Housewares/Appliances
\n",
"
\n",
" \n",
"
\n"
],
"text/plain": [
""
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.style"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Cleaning the data\n",
"\n",
"We see that the table has a column with both the rank and the name of the company, and we want to have only the name of company in a column (as we have the rank in the index of the table). There are a fwe options to do it using string functions such as _RIGHT_ or _LEFT_. In this case, you could use:\n",
"\n",
"```excel\n",
"RIGHT(cell,LEN(cell)-SEARCH(\"char\", cell))\n",
"```\n",
"\n",
"In this example, we will use a powerful tool of regular expression (regex). We will not dive into the syntax of regex now, and it time you will find it easier to read or even write. "
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"clean_names_df = (\n",
" df\n",
" .assign(company_name = lambda x : x['Rank/Company'].str.extract(r'\\d+. ([^*]*)[*]?'))\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next manipulation that we want to do is to have a numeric column with the annual revenues. The data in the table is in millions (which is better for presentation purposes). We will run two manipulations on the data:\n",
"* Start with the clean verion of the table data frame above\n",
"* Remove the dollar sign (_$_) and the comma (_,_) from the 'Net Revenue' Column\n",
"* Convert the above value to number and scale it to dollars by multipling by a million"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"clean_names_df = (\n",
" clean_names_df\n",
" .assign(annual_num = lambda x : x['Net Revenue ($M)'].replace('[\\$,]', '', regex=True))\n",
" .assign(annual_revenues = lambda x : pd.to_numeric(x.annual_num)*10**6)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Rank/Company
\n",
"
Net Revenue ($M)
\n",
"
1-Year Sales Growth
\n",
"
Key Product Categories
\n",
"
company_name
\n",
"
annual_num
\n",
"
annual_revenues
\n",
"
\n",
" \n",
" \n",
"
\n",
"
1
\n",
"
1. Nestle SA*
\n",
"
$92,085
\n",
"
2.10%
\n",
"
Food, Beverage, Confectionery
\n",
"
Nestle SA
\n",
"
92085
\n",
"
92085000000
\n",
"
\n",
"
\n",
"
2
\n",
"
2. Procter & Gamble
\n",
"
$66,832
\n",
"
2.70%
\n",
"
Household Goods, Health & Beauty Aids, OTC Pharma
\n",
"
Procter & Gamble
\n",
"
66832
\n",
"
66832000000
\n",
"
\n",
"
\n",
"
3
\n",
"
3. PepsiCo
\n",
"
$64,661
\n",
"
1.80%
\n",
"
Food, Beverage
\n",
"
PepsiCo
\n",
"
64661
\n",
"
64661000000
\n",
"
\n",
"
\n",
"
4
\n",
"
4. Unilever N.V.*
\n",
"
$56,188
\n",
"
-5.10%
\n",
"
Household Goods, Food, Health & Beauty Aids
\n",
"
Unilever N.V.
\n",
"
56188
\n",
"
56188000000
\n",
"
\n",
"
\n",
"
5
\n",
"
5. Anheuser-Busch InBev
\n",
"
$54,619
\n",
"
-3.20%
\n",
"
Wine & Spirits
\n",
"
Anheuser-Busch InBev
\n",
"
54619
\n",
"
54619000000
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"0 Rank/Company Net Revenue ($M) 1-Year Sales Growth \\\n",
"1 1. Nestle SA* $92,085 2.10% \n",
"2 2. Procter & Gamble $66,832 2.70% \n",
"3 3. PepsiCo $64,661 1.80% \n",
"4 4. Unilever N.V.* $56,188 -5.10% \n",
"5 5. Anheuser-Busch InBev $54,619 -3.20% \n",
"\n",
"0 Key Product Categories company_name \\\n",
"1 Food, Beverage, Confectionery Nestle SA \n",
"2 Household Goods, Health & Beauty Aids, OTC Pharma Procter & Gamble \n",
"3 Food, Beverage PepsiCo \n",
"4 Household Goods, Food, Health & Beauty Aids Unilever N.V. \n",
"5 Wine & Spirits Anheuser-Busch InBev \n",
"\n",
"0 annual_num annual_revenues \n",
"1 92085 92085000000 \n",
"2 66832 66832000000 \n",
"3 64661 64661000000 \n",
"4 56188 56188000000 \n",
"5 54619 54619000000 "
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clean_names_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Enriching the data\n",
"\n",
"The next column we want to add to our table before importing it to our CRM system is domain of the company. We will need to use for it a search engine such as Google and a python library that is wrapping it to make it easy for us to use it. First, let's install the python library:\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting google\n",
" Downloading google-3.0.0-py2.py3-none-any.whl (45 kB)\n",
"\u001b[K |████████████████████████████████| 45 kB 1.6 MB/s eta 0:00:01\n",
"\u001b[?25hCollecting beautifulsoup4\n",
" Downloading beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)\n",
"\u001b[K |████████████████████████████████| 97 kB 4.6 MB/s eta 0:00:011\n",
"\u001b[?25hCollecting soupsieve>1.2\n",
" Downloading soupsieve-2.3.1-py3-none-any.whl (37 kB)\n",
"Installing collected packages: soupsieve, beautifulsoup4, google\n",
"Successfully installed beautifulsoup4-4.10.0 google-3.0.0 soupsieve-2.3.1\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install google"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let's try it on one of the company names"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"https://www.nestle.com/\n"
]
}
],
"source": [
"from googlesearch import search\n",
"for url in search('Nestle', stop=1):\n",
" print(url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above cell is using a powerful concept in programming called _for-loop_ or _list-comprehesion_. However, we will try to minimize this programming concept and convert it to a simpler command that is returning the first value"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'https://shop.hasbro.com/en-us'"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(search('Hasbro Inc.', stop=1),'Nope')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Search Domain Function\n",
"\n",
"Now we can see the following function that will return to us the domain name of every company we will call it. The function is using various complicated concepts such as regex that we already saw, and _try-except_ part which is used to handle errors (what happened if we don't find a domain name). The function is also waiting 20 seconds before calling the search engine to avoid exceeding some limits that systems are putting against bots (which we just created). The _sleep_ will make the actual run of the data much longer, but it is still automated and we don't need to run this repeative task manually.\n",
"\n",
"\n",
"For now, we don't need to understand these details:\n",
"* Wait 20 seconds between calls to avoid throttling by Google\n",
"* Search for the website of the name of the company and return the first result\n",
"* Extract the format of the http or https of the above result\n",
"* Take the first match \n",
"* Return the match or 'NA' if not found"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"import time\n",
"\n",
"def search_domain(company_name):\n",
" time.sleep(20)\n",
" try: \n",
" domain = next(search(str(company_name)+' website', stop=1))\n",
" m = re.search('https?://.*?\\.?([0-9a-z-]+\\.co[^/]+)/.*', domain)\n",
" answer = m.group(1) \n",
" except:\n",
" answer = 'NA'\n",
" return answer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's test it with one of the compamy names"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'3m.com'"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search_domain('3M co.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now it is time to call our _search\\_domain_ function on all the rows in our table. This will take an hour or so, due to our _sleep_ time between the calls. For now, let's run it only on the head (first 5) of the table."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"clean_names_df_head = clean_names_df.head()\n",
"clean_names_df_head['domain'] = (\n",
" clean_names_df_head\n",
" .company_name\n",
" .apply(search_domain)\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"